Fix probabilistic variance underflow #527
Draft
Panchadip-128 wants to merge 3 commits into mllam:main from
Conversation
Force-pushed e6e74b2 to bb30e20
Nice catch on the NaN crashes! I’ve definitely been there with PINNs and seen how one zero variance can blow up a whole training run. Clamping at 1e-6 is a solid move for keeping the NLL/CRPS stable. Also, good to see that 0.01 noise replaced with the actual pred_std in the ensemble logic. I’m curious, did you run into these NaN crashes mostly during the initial epochs or during longer auto-regressive rollouts?
Describe your changes
This PR resolves two critical numerical and logical flaws in the probabilistic forecasting (`--output_std`) engine:

- **softplus underflow (NaN crashes):** In `neural_lam/models/base_graph_model.py`, the standard deviation calculation is now clamped to a minimum of `1e-6`. This prevents the network from producing a machine-zero variance, which previously caused division-by-zero errors in the NLL and CRPS metrics, leading to irreversible NaN training losses.
- **Hardcoded ensemble noise:** The `ARModel._sample_ensemble` method in `neural_lam/models/ar_model.py` was previously discarding the model's predicted uncertainty map in favor of a hardcoded `0.01` noise fallback. This has been refactored to utilize the model's dynamically predicted `pred_std` for physically grounded ensemble generation.
- **Tests:** Added `test_base_graph_model_prevents_softplus_underflow_nans` and `test_ar_model_ensemble_samples_from_pred_std` to `tests/test_probabilistic_forecasting.py` to assert that these mathematical stability and logic requirements are met.

Motivation and Context: These bugs made probabilistic training unstable and produced deceptively uniform ensemble spreads that ignored the model's internal confidence.
Dependencies: No new dependencies.
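For the ensemble change, a rough sketch of the after-fix behaviour (again in NumPy for brevity; the real method is `ARModel._sample_ensemble` in `neural_lam/models/ar_model.py`, and `sample_ensemble` below is a simplified, hypothetical stand-in):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_ensemble(pred_mean, pred_std, num_members):
    # Before the fix: noise used a hardcoded 0.01 scale everywhere, giving
    # every grid cell the same spread regardless of model confidence.
    # After the fix: members are perturbed with the predicted pred_std.
    noise = rng.standard_normal((num_members,) + pred_mean.shape)
    return pred_mean[None, ...] + pred_std[None, ...] * noise

pred_mean = np.zeros((4, 3))                        # e.g. (grid_nodes, vars)
pred_std = np.array([0.1, 1.0, 2.0]) * np.ones((4, 1))
members = sample_ensemble(pred_mean, pred_std, num_members=1000)

# Per-variable ensemble spread now tracks pred_std instead of a flat 0.01.
print(members.std(axis=0).mean(axis=0))
```

With a hardcoded 0.01 scale every member is nearly identical to the mean; sampling with `pred_std` lets the spread follow the model's per-variable, per-location confidence.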
Issue Link
solves #526
Type of change
Checklist before requesting a review
`upstream/main` (rebased and force-pushed).

Author checklist after completed review
Added a line to `CHANGELOG.md` describing this change.